An F0 contour control model for totally speaker driven text to speech system

Authors

  • Takehiko Kagoshima
  • Masahiro Morita
  • Shigenobu Seto
  • Masami Akamine
Abstract

The Totally Speaker Driven Text-to-Speech system produces high-quality, natural speech that resembles the acoustic and prosodic characteristics of the original speech corpus. In the F0 contour control of this system, the F0 contour of a whole sentence is produced by concatenating segmental F0 contours, each generated by modifying a vector that represents a typical F0 contour. The representative vectors are selected from an F0 contour codebook, which is designed to minimize the approximation error between the F0 contours generated by the proposed model and the real F0 contours extracted from a speech corpus. Experiments with a Japanese speech corpus confirmed that F0 contours can be modeled with small approximation errors using only 48 representative vectors, and that the synthetic speech sounded very natural and resembled the prosodic characteristics of the original speaker.
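The following is a minimal sketch of the idea described in the abstract, not the authors' implementation. It assumes contours are stored as log-F0 samples and that "modifying" a representative vector means linear time stretching plus an additive offset; the abstract does not specify the modifications, and the function names (`resample`, `fit_segment`, `select_and_concatenate`) and the toy two-vector codebook are hypothetical. It shows the analysis side only: picking, for each real segment, the codebook vector with the smallest approximation error, then concatenating the modified vectors into a sentence contour.

```python
import math

def resample(vec, n):
    """Linearly interpolate vec onto n evenly spaced points (time stretch)."""
    if n == 1:
        return [vec[0]]
    out = []
    for i in range(n):
        pos = i * (len(vec) - 1) / (n - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(vec) - 1)
        frac = pos - lo
        out.append(vec[lo] * (1 - frac) + vec[hi] * frac)
    return out

def fit_segment(rep, target):
    """Stretch rep to the target length, then add the least-squares offset.

    Returns the modified vector and its squared approximation error.
    """
    stretched = resample(rep, len(target))
    offset = sum(t - s for t, s in zip(target, stretched)) / len(target)
    modified = [s + offset for s in stretched]
    err = sum((t - m) ** 2 for t, m in zip(target, modified))
    return modified, err

def select_and_concatenate(codebook, segments):
    """Pick the lowest-error codebook vector per segment and concatenate."""
    contour, chosen, total_err = [], [], 0.0
    for target in segments:
        fits = [fit_segment(rep, target) for rep in codebook]
        best = min(range(len(codebook)), key=lambda k: fits[k][1])
        chosen.append(best)
        contour.extend(fits[best][0])
        total_err += fits[best][1]
    return contour, chosen, total_err

# Toy codebook (assumed): a rise-fall shape and a rising shape.
codebook = [[0.0, 1.0, 0.0], [0.0, 0.5, 1.0]]
# Toy "real" segments (assumed log-F0 values) of different lengths.
segments = [[5.0, 5.5, 6.0, 5.5, 5.0], [4.0, 4.5, 5.0]]
contour, chosen, total_err = select_and_concatenate(codebook, segments)
```

In a full system, the total error over a training corpus would be the quantity the codebook design minimizes, and at synthesis time the vector and its modification parameters would have to be predicted from text rather than fitted to a known target.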

Similar resources

Smooth contour estimation in data-driven pitch modelling

Apple's next-generation text-to-speech (TTS) system in MacOS X uses a superpositional pitch model, comprising a relatively smooth underlying F0 contour and a separate contribution from the influence of the phonetic segments. This paper focuses on the data-driven modelling of the underlying contour, based on electroglottographic signals obtained from a corpus of reiterant speech. F0 extraction fr...

Generation of F0 contour using stochastic mapping and vector quantization control parameters

This paper introduces an F0 contour generation method for text-to-speech synthesis using stochastic mapping and vector quantization control parameters. This model uses a new F0 contour labelling scheme based on the RFC (Rise/Fall/Connection) model [1], which describes F0 contour patterns with seven F0 labels and three pause labels. This paper also suggests an efficient selection method for contro...

Realization of Prosodic Focuses in Corpus-based Generation of Fundamental Frequency Contours of Japanese Based on the Generation Process Model

A method was developed for generating sentence F0 contours of Japanese when a focus is placed on one of the “bunsetsu” of an utterance. It controls F0 based on the F0 model rather than by frame-by-frame F0 prediction, as in HMM-based speech synthesis. The method first predicts differences in the F0 model commands between utterances with and without focus, and then applies them to the F0 model ...

Fujisaki model based F0 contours in Vietnamese TTS

The current paper presents preliminary work towards the integration of the Fujisaki model into the VnVoice Vietnamese TTS system, based on a set of rules to control the F0 contour. A speech corpus consisting of 20 sentences was compiled. Each of the sentences can have various meanings depending on the tone associated with a monosyllabic keyword which it contains. The corpus with a total of 46 s...

Generating F0 contours by statistical manipulation of natural F0 shapes

This paper proposes a method of generating F0 contours from natural F0 segmental shapes for speech synthesis. The extracted shapes of F0 units are basically kept unchanged, by eliminating any averaging operation in the analysis phase and minimizing modification operations in the synthesis phase. The use of “kept-unchanged” F0 shapes has a great potential to incorporate a wide variety of speakin...

Publication date: 1998